16 research outputs found

    Maximizing the Diversity of Exposure in a Social Network

    Social-media platforms have created new ways for citizens to stay informed and participate in public debates. However, to enable a healthy environment for information sharing, social deliberation, and opinion formation, citizens need to be exposed to sufficiently diverse viewpoints that challenge their assumptions, instead of being trapped inside filter bubbles. In this paper, we take a step in this direction and propose a novel approach to maximize the diversity of exposure in a social network. We formulate the problem in the context of information propagation, as a task of recommending a small number of news articles to selected users. We propose a realistic setting where we take into account content and user leanings, and the probability of further sharing an article. This setting allows us to capture the balance between maximizing the spread of information and ensuring the exposure of users to diverse viewpoints. The resulting problem can be cast as maximizing a monotone and submodular function subject to a matroid constraint on the allocation of articles to users. It is a challenging generalization of the influence maximization problem. Yet, we are able to devise scalable approximation algorithms by introducing a novel extension to the notion of random reverse-reachable sets. We experimentally demonstrate the efficiency and scalability of our algorithm on several real-world datasets.
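    The allocation structure above lends itself to the classic greedy for monotone submodular maximization under a partition-matroid constraint. A minimal sketch, with a made-up unit-gain diversity function standing in for the paper's exposure objective (the paper's scalable algorithm instead relies on random reverse-reachable sets):

```python
import itertools

def greedy_matroid(users, articles, gain, capacity=1):
    """Greedily pick (user, article) pairs; each user gets <= capacity articles.

    For monotone submodular objectives this baseline is (1/2)-approximate
    under a matroid constraint. `gain` returns the marginal gain of a pair.
    """
    chosen = []                       # current allocation
    load = {u: 0 for u in users}      # articles assigned per user
    candidates = set(itertools.product(users, articles))
    while True:
        best, best_gain = None, 0.0
        for (u, a) in candidates:
            if load[u] >= capacity:   # matroid: user's capacity exhausted
                continue
            g = gain(chosen, (u, a))  # marginal gain of adding (u, a)
            if g > best_gain:
                best, best_gain = (u, a), g
        if best is None:              # no candidate with positive gain left
            return chosen
        chosen.append(best)
        load[best[0]] += 1
        candidates.discard(best)

# Hypothetical diversity gain (illustration only): an article contributes 1.0
# the first time any user receives it, and nothing afterwards.
def gain(chosen, pair):
    return 0.0 if any(a == pair[1] for _, a in chosen) else 1.0

alloc = greedy_matroid(["u1", "u2"], ["left", "right"], gain)
```

    With two users, two articles, and capacity one, the greedy assigns both articles to distinct users, since repeating an article yields zero marginal gain.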

    Fair Column Subset Selection

    We consider the problem of fair column subset selection. In particular, we assume that two groups are present in the data, and the chosen column subset must provide a good approximation for both, relative to their respective best rank-k approximations. We show that this fair setting introduces significant challenges: in order to extend known results, one cannot do better than the trivial solution of simply picking twice as many columns as the original methods. We adopt a known approach based on deterministic leverage-score sampling, and show that merely sampling a subset of appropriate size becomes NP-hard in the presence of two groups. Whereas finding a subset of two times the desired size is trivial, we provide an efficient algorithm that achieves the same guarantees with essentially 1.5 times that size. We validate our methods through an extensive set of experiments on real-world data.
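    For intuition on the single-group building block, ordinary deterministic leverage-score selection can be sketched as follows. The names and the toy matrix are illustrative, and the fair two-group variant studied in the paper requires its more involved algorithm:

```python
import numpy as np

def leverage_scores(A, k):
    """Rank-k leverage score of column j: squared norm of row j of V_k,
    where A = U S V^T is the SVD of A."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return np.sum(Vt[:k, :] ** 2, axis=0)   # one score per column

def top_columns(A, k):
    """Deterministic selection: keep the k highest-leverage columns."""
    scores = leverage_scores(A, k)
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 6))            # toy data matrix
cols = top_columns(A, 3)
```

    A standard sanity check: the rank-k leverage scores always sum to k, because the first k rows of V^T are orthonormal.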

    Provable randomized rounding for minimum-similarity diversification

    When searching for information in a data collection, we are often interested not only in finding relevant items, but also in assembling a diverse set, so as to explore different concepts that are present in the data. This problem has been researched extensively. However, finding a set of items with minimal pairwise similarities can be computationally challenging, and most existing works striving for quality guarantees assume that item relatedness is measured by a distance function. Given the widespread use of similarity functions in many domains, we believe this to be an important gap in the literature. In this paper we study the problem of finding a diverse set of items, when item relatedness is measured by a similarity function. We formulate the diversification task using a flexible, broadly applicable minimization objective, consisting of the sum of pairwise similarities of the selected items and a relevance penalty term. To find good solutions we adopt a randomized rounding strategy, which is challenging to analyze because of the cardinality constraint present in our formulation. Even though this obstacle can be overcome using dependent rounding, we show that it is possible to obtain provably good solutions using an independent approach, which is faster, simpler to implement and completely parallelizable. Our analysis relies on a novel bound for the ratio of Poisson-Binomial densities, which is of independent interest and has potential implications for other combinatorial-optimization problems. We leverage this result to design an efficient randomized algorithm that provides a lower-order additive approximation guarantee. We validate our method using several benchmark datasets, and show that it consistently outperforms the greedy approaches that are commonly used in the literature. Peer reviewed.
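    The independent-rounding step can be sketched in a few lines. The fractional vector x is assumed to come from a continuous relaxation whose entries sum to the cardinality budget k; it is hardcoded here for illustration:

```python
import random

def independent_round(x, seed=None):
    """Include item i with probability x[i], independently of all others."""
    rng = random.Random(seed)
    return [i for i, xi in enumerate(x) if rng.random() < xi]

# Assumed fractional solution with sum(x) = 2 = k.
x = [0.9, 0.9, 0.1, 0.1]
picked = independent_round(x, seed=7)

# The size of the rounded set is Poisson-Binomial with mean k; its
# concentration around k is what the paper's density-ratio bound controls.
avg = sum(len(independent_round(x, seed=s)) for s in range(1000)) / 1000
```

    Unlike dependent rounding, every item is decided in isolation, which is what makes the procedure trivially parallelizable.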

    Social Media for Social Good: Models and Algorithms

    Social media employ algorithms to promote content that their users would find interesting, so as to maximize user engagement. They therefore act as a lens, or "filter", through which an individual looks at reality. These filters create alternative "digital realities" for the participants of social networks; a "filter bubble" refers to the state of ideological isolation resulting from social-media personalization algorithms. In this thesis we propose approaches to algorithmically break these filter bubbles.

    To break filter bubbles successfully, we first develop methods to detect them and to characterize their strength. First, we look at measuring the polarization of opinions, a typical manifestation of a filter bubble. Our approach builds on a well-known opinion-formation model and characterizes the random-walk distance of all individuals to the two opposing opinions present in the polarized discussion. We then turn our focus to signed networks, where relationships are characterized by friendship or enmity, and aim to find the maximum possible partition of the graph into two opposing hostile factions. In another line of work, comprising two papers, we measure the diversity of the exposure of individuals to different opinions. In the first paper, we look at the difference of the values describing information exposure across all edges in a social graph. In the second, we measure diversity with respect to a model of news-item propagation in the network, based on a variant of the well-studied independent-cascade model.

    Subsequently, we propose algorithmic interventions to break filter bubbles, based on the aforementioned measures of polarization and diversity of exposure. Regarding polarization, we consider the task of moderating the opinions of a small subset of individuals in order to minimize polarization. Diversity of exposure, in contrast, we consider a beneficial quantity that should be maximized. We therefore study the problem of maximizing the diversity index by changing the exposure of a small subset of individuals to the opposite one. Regarding the "lack of diversity of exposure", we define a function to be maximized that contains its negation; the resulting maximization problem consists of selecting a small subset of individuals to share a set of news articles in their network, starting multiple parallel cascades. Finally, we examine a different type of intervention that does not directly optimize any measure: we organically increase the number of edges in a network by leveraging the strong triadic closure property, a well-known principle from sociology. Given this property, we ask which friendships should be converted from weak to strong in order to maximize the potential for new edges. For all proposed problems we present a complexity analysis and, in most cases, offer performance guarantees. We evaluate our methods on real-life social networks and compare them against several baselines.
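    As a toy illustration of the edge-based diversity measure mentioned above (the exact functional form here is assumed for exposition): with an exposure value per user, diversity can be read as the total disagreement across edges, and flipping one user's exposure changes the index:

```python
def diversity_index(edges, s):
    """Total disagreement across edges: sum over (u, v) of |s[u] - s[v]|.
    This is an assumed, simplified form of an edge-based diversity measure."""
    return sum(abs(s[u] - s[v]) for u, v in edges)

# Triangle a - b - c - a where everyone currently sees the same side.
edges = [("a", "b"), ("b", "c"), ("c", "a")]
s = {"a": 0, "b": 0, "c": 0}
before = diversity_index(edges, s)    # no disagreement anywhere

s["c"] = 1                            # intervention: flip one user's exposure
after = diversity_index(edges, s)     # edges (b, c) and (c, a) now disagree
```

    The intervention problem is then to choose which few exposures to flip so that the index increases the most.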

    Tell me something my friends do not know

    | openaire: EC/H2020/654024/EU//SoBigData
    Social media have a great potential to improve information dissemination in our society, yet they have been held accountable for a number of undesirable effects, such as polarization and filter bubbles. It is thus important to understand these negative phenomena and develop methods to combat them. In this paper we propose a novel approach to address the problem of breaking filter bubbles in social media. We do so by aiming to maximize the diversity of the information exposed to connected social-media users. We formulate the problem of maximizing the diversity of exposure as a quadratic-knapsack problem. We show that the proposed diversity-maximization problem is inapproximable, and thus we resort to polynomial-time algorithms without approximation guarantees, inspired by solutions developed for the quadratic-knapsack problem, as well as scalable greedy heuristics. We complement our algorithms with instance-specific upper bounds, which are used to provide empirical approximation guarantees for the given problem instances. Our experimental evaluation shows that a proposed greedy algorithm followed by randomized local search is the algorithm of choice, given its quality-vs.-efficiency trade-off. Peer reviewed.
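    A rough sketch of the greedy baseline, with an assumed cut-style diversity objective standing in for the paper's quadratic-knapsack formulation (the randomized local-search refinement that follows greedy is omitted):

```python
def marginal_gain(S, u, adj):
    """Gain of flipping u: edges to non-flipped neighbors become diverse,
    edges to already-flipped neighbors stop being diverse."""
    return sum(1 if v not in S else -1 for v in adj[u])

def greedy_diversity(adj, k):
    """Pick up to k users to flip, one best marginal gain at a time."""
    S = set()
    for _ in range(k):
        best = max((u for u in adj if u not in S),
                   key=lambda u: marginal_gain(S, u, adj),
                   default=None)
        if best is None or marginal_gain(S, best, adj) <= 0:
            break                      # no flip improves the objective
        S.add(best)
    return S

# Path graph a - b - c: flipping the middle node covers both edges.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
S = greedy_diversity(adj, 1)
```

    On this toy path graph the single flip goes to the middle node, whose two incident edges both become diverse.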

    Strengthening ties towards a highly-connected world

    | openaire: EC/H2020/871042/EU//SoBigData-PlusPlus
    Funding Information: This research is supported by the Academy of Finland Projects AIDA (317085) and MLDB (325117), the ERC Advanced Grant REBOUND (834862), the EC H2020 RIA Project SoBigData++ (871042), and the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation. Publisher Copyright: © 2021, The Author(s).
    Online social networks provide a forum where people make new connections, learn more about the world, get exposed to different points of view, and access information that was previously inaccessible. It is natural to assume that content-delivery algorithms in social networks should not only aim to maximize user engagement but also offer opportunities for increasing connectivity, enabling social networks to achieve their full potential. Our aim is to develop methods that foster the creation of new connections and, subsequently, improve the flow of information in the network. To achieve this goal, we propose to leverage the strong triadic closure principle, and consider violations of this principle as opportunities for creating more social links. We formalize this idea as an algorithmic problem related to the densest k-subgraph problem. For this new problem, we establish hardness results and propose approximation algorithms. We identify two special cases of the problem that admit a constant-factor approximation. Finally, we experimentally evaluate our proposed algorithm on real-world social networks, and we additionally evaluate some simpler but more scalable algorithms. Peer reviewed.
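    The notion of a violation can be sketched directly: a pair of strong ties at a common node whose endpoints are not themselves connected is an opportunity for a new edge. The data model below is assumed for illustration:

```python
from itertools import combinations

def stc_opportunities(strong, edges):
    """Find strong-triadic-closure violations.

    strong: dict node -> set of its strong-tie neighbors
    edges:  set of frozensets, each an undirected edge
    Returns pairs (v, w) that share a strong-tie neighbor but are unlinked.
    """
    opps = set()
    for u, nbrs in strong.items():
        for v, w in combinations(sorted(nbrs), 2):
            if frozenset((v, w)) not in edges:
                opps.add(frozenset((v, w)))   # open strong wedge at u
    return opps

# u has strong ties to both v and w, but v and w are not connected.
edges = {frozenset(e) for e in [("u", "v"), ("u", "w")]}
strong = {"u": {"v", "w"}}
opps = stc_opportunities(strong, edges)
```

    The optimization problem studied in the paper is then which weak ties to upgrade to strong so as to maximize the number of such opportunities.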


    Generalized Leverage Scores: Geometric Interpretation and Applications

    | openaire: EC/H2020/871042/EU//SoBigData-PlusPlus
    In problems involving matrix computations, the concept of leverage has found a large number of applications. In particular, leverage scores, which relate the columns of a matrix to the subspaces spanned by its leading singular vectors, are helpful in revealing column subsets that approximately factorize a matrix with quality guarantees. As such, they provide a solid foundation for a variety of machine-learning methods. In this paper we extend the definition of leverage scores to relate the columns of a matrix to arbitrary subsets of singular vectors. We establish a precise connection between column and singular-vector subsets by relating the concepts of leverage scores and principal angles between subspaces. We employ this result to design approximation algorithms with provable guarantees for two well-known problems: generalized column subset selection and sparse canonical correlation analysis. We run numerical experiments to provide further insight into the proposed methods. The novel bounds we derive improve our understanding of fundamental concepts in matrix approximation, and our insights may serve as building blocks for further contributions. Peer reviewed.
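    A minimal sketch of the generalized definition, assuming the standard SVD-based formula: column j is scored against an arbitrary index set S of right singular vectors, and taking S = {0, ..., k-1} recovers the ordinary rank-k leverage scores:

```python
import numpy as np

def generalized_leverage(A, S):
    """Score of column j w.r.t. singular-vector subset S:
    ell_j(S) = sum_{i in S} V[j, i]^2, where A = U diag(s) V^T."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return np.sum(Vt[list(S), :] ** 2, axis=0)   # one score per column

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 5))                 # toy matrix
scores = generalized_leverage(A, {1, 3})         # vs. 2nd and 4th singular vectors
```

    Because the rows of V^T are orthonormal, the scores always sum to |S|, mirroring the classical fact that rank-k leverage scores sum to k.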